EDA - Exploratory Data Analysis

Importing data

Loaded df_appearances from ../pickles/df_appearances.pkl
Skipping player_performance...
Skipping player_game_team_mapping...
Skipping df_games_odds...
Loaded df_teamstats from ../pickles/df_teamstats.pkl
Loaded df_shots from ../pickles/df_shots.pkl
Loaded gameresult from ../pickles/gameresult.pkl
Loaded df_after_outliers_missing from ../pickles/df_after_outliers_missing.pkl
Skipping team_performance...
Loaded df_with_categories from ../pickles/df_with_categories.pkl
Loaded df_num_after_EDA from ../pickles/df_num_after_EDA.pkl
Skipping df_games...
Loaded manipulated_data_no_outleirs from ../pickles/manipulated_data_no_outleirs.pkl
Loaded player_shots from ../pickles/player_shots.pkl
Loaded df_after_EDA from ../pickles/df_after_EDA.pkl
Loaded teamstats from ../pickles/teamstats.pkl
Loaded df_combined from ../pickles/df_combined.pkl
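
The log above suggests a loop over the files in ../pickles that loads some and skips others. The loader itself is not shown in the source, so the following is only a minimal sketch of how such a step could look. The function name load_pickles, its skip parameter, and the stand-in objects are assumptions made for illustration.

```python
from pathlib import Path
import pickle
import tempfile


def load_pickles(directory, skip=()):
    """Load every .pkl file in `directory` into a dict keyed by file stem.

    Hypothetical helper: names listed in `skip` are reported and not loaded.
    """
    loaded = {}
    for path in sorted(Path(directory).glob("*.pkl")):
        name = path.stem
        if name in skip:
            print(f"Skipping {name}...")
            continue
        with path.open("rb") as fh:
            loaded[name] = pickle.load(fh)
        print(f"Loaded {name} from {path}")
    return loaded


# Demonstration on a throwaway directory with stand-in objects,
# not the real DataFrames from the notebook.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("df_combined", "df_games"):
        with open(Path(tmp) / f"{name}.pkl", "wb") as fh:
            pickle.dump({"name": name}, fh)
    frames = load_pickles(tmp, skip={"df_games"})
```

In the notebook, the call would presumably point at ../pickles and pass the set of names reported as skipped in the log.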
<class 'pandas.core.frame.DataFrame'>
Int64Index: 12680 entries, 0 to 12679
Data columns (total 47 columns):
 #   Column                    Non-Null Count  Dtype  
---  ------                    --------------  -----  
 0   gameID                    12680 non-null  int64  
 1   leagueID                  12680 non-null  int64  
 2   season                    12680 non-null  int64  
 3   date                      12680 non-null  object 
 4   homeTeamID                12680 non-null  int64  
 5   awayTeamID                12680 non-null  int64  
 6   homeGoals                 12680 non-null  int64  
 7   awayGoals                 12680 non-null  int64  
 8   homeGoalsHalfTime         12680 non-null  int64  
 9   awayGoalsHalfTime         12680 non-null  int64  
 10  home_xGoals               12680 non-null  float64
 11  home_shots                12680 non-null  int64  
 12  home_shotsOnTarget        12680 non-null  int64  
 13  home_deep                 12680 non-null  int64  
 14  home_ppda                 12680 non-null  float64
 15  home_fouls                12680 non-null  int64  
 16  home_corners              12680 non-null  int64  
 17  home_yellowCards          12679 non-null  float64
 18  home_redCards             12680 non-null  int64  
 19  home_total_assists        12680 non-null  int64  
 20  home_total_xAssists       12680 non-null  float64
 21  home_total_key_passes     12680 non-null  int64  
 22  home_total_xGoalsChain    12680 non-null  float64
 23  home_total_xGoalsBuildup  12680 non-null  float64
 24  home_total_yellow_cards   12680 non-null  int64  
 25  home_total_red_cards      12680 non-null  int64  
 26  home_total_blocked_shots  12677 non-null  float64
 27  home_total_saved_shots    12677 non-null  float64
 28  away_xGoals               12680 non-null  float64
 29  away_shots                12680 non-null  int64  
 30  away_shotsOnTarget        12680 non-null  int64  
 31  away_deep                 12680 non-null  int64  
 32  away_ppda                 12680 non-null  float64
 33  away_fouls                12680 non-null  int64  
 34  away_corners              12680 non-null  int64  
 35  away_yellowCards          12680 non-null  float64
 36  away_redCards             12680 non-null  int64  
 37  away_total_assists        12680 non-null  int64  
 38  away_total_xAssists       12680 non-null  float64
 39  away_total_key_passes     12680 non-null  int64  
 40  away_total_xGoalsChain    12680 non-null  float64
 41  away_total_xGoalsBuildup  12680 non-null  float64
 42  away_total_yellow_cards   12680 non-null  int64  
 43  away_total_red_cards      12680 non-null  int64  
 44  away_total_blocked_shots  12672 non-null  float64
 45  away_total_saved_shots    12672 non-null  float64
 46  gameresult                12680 non-null  object 
dtypes: float64(16), int64(29), object(2)
memory usage: 4.6+ MB
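
The non-null counts above imply a handful of missing values: 1 in home_yellowCards, 3 each in home_total_blocked_shots and home_total_saved_shots, and 8 each in away_total_blocked_shots and away_total_saved_shots. A short sketch of how to list such columns directly follows. The small frame is made-up stand-in data, not the real table.

```python
import numpy as np
import pandas as pd

# Stand-in frame with a few gaps; the values are invented for illustration.
df = pd.DataFrame({
    "home_yellowCards": [1.0, np.nan, 3.0, 2.0],
    "home_total_blocked_shots": [2.0, 4.0, np.nan, np.nan],
    "home_shots": [10, 12, 9, 15],
})

# Count missing values per column and keep only columns that have any.
missing = df.isna().sum()
missing = missing[missing > 0].sort_values(ascending=False)
print(missing)
```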
gameID leagueID season date homeTeamID awayTeamID home_Goals away_Goals home_GoalsHalfTime away_GoalsHalfTime ... away_total_assists away_total_xAssists away_total_key_passes away_total_xGoalsChain away_total_xGoalsBuildup away_total_yellow_cards away_total_red_cards away_total_blocked_shots away_total_saved_shots gameresult
0 81 1 2015 2015-08-08 15:45:00 89 82 1 0 1 0 ... 0 0.586365 7 1.745371 0.811549 3 0 3.0 4.0 H
1 82 1 2015 2015-08-08 18:00:00 73 71 0 1 0 0 ... 1 0.560695 4 1.238205 0.736815 4 0 2.0 2.0 A
2 83 1 2015 2015-08-08 18:00:00 72 90 2 2 0 1 ... 1 0.418385 8 1.959323 1.030588 2 0 3.0 3.0 D
3 84 1 2015 2015-08-08 18:00:00 75 77 4 2 3 0 ... 2 1.288886 9 7.622863 5.617276 4 0 2.0 3.0 H
4 85 1 2015 2015-08-08 18:00:00 79 78 1 3 0 1 ... 3 2.050685 10 10.799517 8.554974 0 0 2.0 4.0 A
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
12675 16131 5 2020 2021-05-23 19:00:00 168 166 1 2 1 1 ... 1 0.307960 4 1.223212 0.715843 2 0 1.0 3.0 A
12676 16132 5 2020 2021-05-23 19:00:00 177 176 1 2 1 1 ... 1 0.775388 7 2.610665 1.758012 1 0 4.0 3.0 A
12677 16133 5 2020 2021-05-23 19:00:00 163 235 2 0 1 0 ... 0 0.216965 6 0.884652 0.544502 0 0 0.0 2.0 H
12678 16134 5 2020 2021-05-23 19:00:00 175 181 0 1 0 1 ... 1 0.565077 6 1.256511 0.764512 0 0 1.0 1.0 A
12679 16135 5 2020 2021-05-23 19:00:00 225 179 1 1 1 0 ... 1 0.470476 4 0.502347 0.421488 1 0 2.0 0.0 D

12680 rows × 47 columns

Data visualization

[45 figures omitted: no descriptions were provided for these images]

DateTime

0       2015-08-08 15:45:00
1       2015-08-08 18:00:00
2       2015-08-08 18:00:00
3       2015-08-08 18:00:00
4       2015-08-08 18:00:00
                ...        
12675   2021-05-23 19:00:00
12676   2021-05-23 19:00:00
12677   2021-05-23 19:00:00
12678   2021-05-23 19:00:00
12679   2021-05-23 19:00:00
Name: date, Length: 12680, dtype: datetime64[ns]
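
The date column is listed as object in the first column summary and as a datetime type here, so a conversion took place in between. A minimal sketch of that conversion with pandas.to_datetime, using three timestamps copied from the output above (the explicit format string is an assumption based on how the values are printed):

```python
import pandas as pd

# Three timestamps copied from the printed series, stored as strings.
df = pd.DataFrame({
    "date": [
        "2015-08-08 15:45:00",
        "2015-08-08 18:00:00",
        "2021-05-23 19:00:00",
    ]
})

# Parse the strings into a datetime column.
df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d %H:%M:%S")
print(df["date"].dtype)
```

Once converted, the .dt accessor exposes parts such as df["date"].dt.year or df["date"].dt.dayofweek.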

Categoricals

0        H
1        A
2        D
3        H
4        A
        ..
12675    A
12676    A
12677    H
12678    A
12679    D
Name: gameresult, Length: 12680, dtype: object
<class 'pandas.core.frame.DataFrame'>
Int64Index: 12680 entries, 0 to 12679
Data columns (total 1 columns):
 #   Column      Non-Null Count  Dtype   
---  ------      --------------  -----   
 0   gameresult  12680 non-null  category
dtypes: category(1)
memory usage: 111.6 KB
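
The gameresult column goes from object to category in the two outputs above. A small sketch of that cast on stand-in values (the five results are taken from the first rows of the printed series):

```python
import pandas as pd

# First five results from the printed series.
results = pd.DataFrame({"gameresult": ["H", "A", "D", "H", "A"]})

# Cast the string column to the pandas categorical dtype.
results["gameresult"] = results["gameresult"].astype("category")

# Categories are inferred from the distinct values.
categories = list(results["gameresult"].cat.categories)
print(categories)
```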
[Figure omitted: no description provided]

Continuous (numeric)

(12680, 40)
<class 'pandas.core.frame.DataFrame'>
Int64Index: 12680 entries, 0 to 12679
Data columns (total 40 columns):
 #   Column                    Non-Null Count  Dtype  
---  ------                    --------------  -----  
 0   home_Goals                12680 non-null  int64  
 1   away_Goals                12680 non-null  int64  
 2   home_GoalsHalfTime        12680 non-null  int64  
 3   away_GoalsHalfTime        12680 non-null  int64  
 4   home_xGoals               12680 non-null  float64
 5   home_shots                12680 non-null  int64  
 6   home_shotsOnTarget        12680 non-null  int64  
 7   home_deep                 12680 non-null  int64  
 8   home_ppda                 12680 non-null  float64
 9   home_fouls                12680 non-null  int64  
 10  home_corners              12680 non-null  int64  
 11  home_yellowCards          12679 non-null  float64
 12  home_redCards             12680 non-null  int64  
 13  home_total_assists        12680 non-null  int64  
 14  home_total_xAssists       12680 non-null  float64
 15  home_total_key_passes     12680 non-null  int64  
 16  home_total_xGoalsChain    12680 non-null  float64
 17  home_total_xGoalsBuildup  12680 non-null  float64
 18  home_total_yellow_cards   12680 non-null  int64  
 19  home_total_red_cards      12680 non-null  int64  
 20  home_total_blocked_shots  12677 non-null  float64
 21  home_total_saved_shots    12677 non-null  float64
 22  away_xGoals               12680 non-null  float64
 23  away_shots                12680 non-null  int64  
 24  away_shotsOnTarget        12680 non-null  int64  
 25  away_deep                 12680 non-null  int64  
 26  away_ppda                 12680 non-null  float64
 27  away_fouls                12680 non-null  int64  
 28  away_corners              12680 non-null  int64  
 29  away_yellowCards          12680 non-null  float64
 30  away_redCards             12680 non-null  int64  
 31  away_total_assists        12680 non-null  int64  
 32  away_total_xAssists       12680 non-null  float64
 33  away_total_key_passes     12680 non-null  int64  
 34  away_total_xGoalsChain    12680 non-null  float64
 35  away_total_xGoalsBuildup  12680 non-null  float64
 36  away_total_yellow_cards   12680 non-null  int64  
 37  away_total_red_cards      12680 non-null  int64  
 38  away_total_blocked_shots  12672 non-null  float64
 39  away_total_saved_shots    12672 non-null  float64
dtypes: float64(16), int64(24)
memory usage: 4.0 MB
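
The 40 columns above are the 47 original columns minus gameID, leagueID, season, date, homeTeamID, awayTeamID and gameresult. How the notebook made that selection is not shown. One way to get the same result, sketched here on a reduced stand-in frame, is to drop the identifier columns and keep the numeric dtypes. The home_xGoals values are invented.

```python
import pandas as pd

# Reduced stand-in for the full table; home_xGoals values are made up.
df = pd.DataFrame({
    "gameID": [81, 82],
    "leagueID": [1, 1],
    "season": [2015, 2015],
    "date": pd.to_datetime(["2015-08-08 15:45:00", "2015-08-08 18:00:00"]),
    "homeTeamID": [89, 73],
    "awayTeamID": [82, 71],
    "home_Goals": [1, 0],
    "home_xGoals": [0.5, 1.2],
    "gameresult": ["H", "A"],
})

# Identifier columns are numeric but carry no magnitude, so drop them first.
id_columns = ["gameID", "leagueID", "season", "homeTeamID", "awayTeamID"]

# select_dtypes then removes the datetime and string columns.
df_num = df.drop(columns=id_columns).select_dtypes(include="number")
print(df_num.shape)
```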
[Figure: 20 x 2 grid of per-column plots, one panel for each of the 40 numeric columns above, each titled with its column name (home_Goals through away_total_saved_shots)]

Skewness

  skewness
home_total_red_cards 3.421199
home_redCards 3.412354
away_ppda 3.277596
away_total_red_cards 2.893004
away_redCards 2.880568
home_ppda 2.602424
away_total_xGoalsBuildup 2.408524
home_total_xGoalsBuildup 2.191562
away_total_xGoalsChain 1.972202
home_total_xGoalsChain 1.819149
away_GoalsHalfTime 1.412630
away_total_xAssists 1.388455
home_deep 1.358435
away_total_assists 1.337440
home_GoalsHalfTime 1.294446
away_deep 1.282733
home_total_xAssists 1.226424
home_total_assists 1.154587
away_xGoals 1.126784
away_Goals 1.122382
home_xGoals 1.057793
away_total_blocked_shots 1.020413
home_Goals 0.990914
home_total_blocked_shots 0.988462
home_total_saved_shots 0.937809
away_total_saved_shots 0.859399
away_shotsOnTarget 0.745620
home_shotsOnTarget 0.722233
home_total_key_passes 0.701148
away_corners 0.700852
home_corners 0.698146
home_yellowCards 0.665313
away_total_key_passes 0.654491
home_shots 0.622889
home_total_yellow_cards 0.614549
away_shots 0.600380
away_yellowCards 0.530416
away_total_yellow_cards 0.493265
home_fouls 0.387620
away_fouls 0.385974
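
A table like the one above can be produced with DataFrame.skew followed by a descending sort. The sketch below uses two invented columns, so its numbers will not match the table. It only illustrates the computation.

```python
import pandas as pd

# Invented values: red cards are mostly zero (right-skewed),
# fouls are roughly symmetric.
df_num = pd.DataFrame({
    "home_redCards": [0, 0, 0, 0, 0, 0, 0, 1, 0, 2],
    "home_fouls": [10, 12, 14, 11, 13, 12, 15, 9, 13, 12],
})

# Per-column sample skewness, largest first, as a one-column frame.
skewness = df_num.skew().sort_values(ascending=False).to_frame("skewness")
print(skewness)
```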

Y - Target Value

[Figure: count plot of gameresult (x axis: gameresult, y axis: count)]
H    5654
A    3854
D    3172
Name: gameresult, dtype: int64
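
The three counts sum to 12680, matching the row count, and the classes are moderately imbalanced. Converting the counts to shares makes this easier to read. The sketch below starts from the counts printed above rather than from the full column.

```python
import pandas as pd

# Counts copied from the value_counts output above.
counts = pd.Series({"H": 5654, "A": 3854, "D": 3172}, name="gameresult")

# Share of each outcome, rounded to three decimals.
shares = (counts / counts.sum()).round(3)
print(shares)
```

Home wins make up roughly 44.6% of matches, so a model that always predicts H would already reach about that accuracy.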

Label Encoding

<class 'pandas.core.frame.DataFrame'>
Int64Index: 12680 entries, 0 to 12679
Data columns (total 47 columns):
 #   Column                    Non-Null Count  Dtype         
---  ------                    --------------  -----         
 0   gameID                    12680 non-null  int64         
 1   leagueID                  12680 non-null  int64         
 2   season                    12680 non-null  int64         
 3   date                      12680 non-null  datetime64[ns]
 4   homeTeamID                12680 non-null  int64         
 5   awayTeamID                12680 non-null  int64         
 6   home_Goals                12680 non-null  int64         
 7   away_Goals                12680 non-null  int64         
 8   home_GoalsHalfTime        12680 non-null  int64         
 9   away_GoalsHalfTime        12680 non-null  int64         
 10  home_xGoals               12680 non-null  float64       
 11  home_shots                12680 non-null  int64         
 12  home_shotsOnTarget        12680 non-null  int64         
 13  home_deep                 12680 non-null  int64         
 14  home_ppda                 12680 non-null  float64       
 15  home_fouls                12680 non-null  int64         
 16  home_corners              12680 non-null  int64         
 17  home_yellowCards          12679 non-null  float64       
 18  home_redCards             12680 non-null  int64         
 19  home_total_assists        12680 non-null  int64         
 20  home_total_xAssists       12680 non-null  float64       
 21  home_total_key_passes     12680 non-null  int64         
 22  home_total_xGoalsChain    12680 non-null  float64       
 23  home_total_xGoalsBuildup  12680 non-null  float64       
 24  home_total_yellow_cards   12680 non-null  int64         
 25  home_total_red_cards      12680 non-null  int64         
 26  home_total_blocked_shots  12677 non-null  float64       
 27  home_total_saved_shots    12677 non-null  float64       
 28  away_xGoals               12680 non-null  float64       
 29  away_shots                12680 non-null  int64         
 30  away_shotsOnTarget        12680 non-null  int64         
 31  away_deep                 12680 non-null  int64         
 32  away_ppda                 12680 non-null  float64       
 33  away_fouls                12680 non-null  int64         
 34  away_corners              12680 non-null  int64         
 35  away_yellowCards          12680 non-null  float64       
 36  away_redCards             12680 non-null  int64         
 37  away_total_assists        12680 non-null  int64         
 38  away_total_xAssists       12680 non-null  float64       
 39  away_total_key_passes     12680 non-null  int64         
 40  away_total_xGoalsChain    12680 non-null  float64       
 41  away_total_xGoalsBuildup  12680 non-null  float64       
 42  away_total_yellow_cards   12680 non-null  int64         
 43  away_total_red_cards      12680 non-null  int64         
 44  away_total_blocked_shots  12672 non-null  float64       
 45  away_total_saved_shots    12672 non-null  float64       
 46  gameresult                12680 non-null  int64         
dtypes: datetime64[ns](1), float64(16), int64(30)
memory usage: 4.6 MB
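
After this step gameresult is int64, but the output does not show which integer stands for which outcome. The sketch below uses an explicit dictionary so the mapping is visible. The codes H=0, D=1, A=2 are an assumption chosen for illustration and may differ from what the notebook used.

```python
import pandas as pd

# First five results from the printed series.
df = pd.DataFrame({"gameresult": ["H", "A", "D", "H", "A"]})

# Assumed mapping; the source does not state the actual codes.
result_codes = {"H": 0, "D": 1, "A": 2}

df["gameresult"] = df["gameresult"].map(result_codes).astype("int64")
print(df["gameresult"].tolist())
```

An explicit mapping keeps the encoding reproducible and easy to invert when reading predictions back.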

Correlation

home_Goals away_Goals home_GoalsHalfTime away_GoalsHalfTime home_xGoals home_shots home_shotsOnTarget home_deep home_ppda home_fouls ... away_redCards away_total_assists away_total_xAssists away_total_key_passes away_total_xGoalsChain away_total_xGoalsBuildup away_total_yellow_cards away_total_red_cards away_total_blocked_shots away_total_saved_shots
home_Goals 1.000000 -0.075874 0.658365 -0.031795 0.608347 0.264244 0.566983 0.244392 -0.016869 -0.076529 ... 0.078895 -0.083797 -0.148038 -0.090315 -0.153945 -0.134761 -0.004914 0.080064 -0.022438 -0.069618
away_Goals -0.075874 1.000000 -0.034998 0.665542 -0.124532 -0.062635 -0.093423 -0.064400 0.081781 0.007621 ... -0.065490 0.826960 0.519682 0.288618 0.497257 0.428352 -0.015292 -0.065757 0.038495 0.098243
home_GoalsHalfTime 0.658365 -0.034998 1.000000 -0.034418 0.384339 0.094571 0.347701 0.113474 0.078865 -0.028519 ... 0.036997 -0.052697 -0.081793 0.005640 -0.091736 -0.075297 -0.000887 0.037384 0.042827 -0.014708
away_GoalsHalfTime -0.031795 0.665542 -0.034418 1.000000 -0.067653 0.020847 -0.028309 -0.000579 -0.009549 -0.004332 ... -0.019710 0.556684 0.342887 0.131941 0.336698 0.286618 -0.009993 -0.017594 -0.033448 0.041945
home_xGoals 0.608347 -0.124532 0.384339 -0.067653 1.000000 0.616286 0.660090 0.475620 -0.176584 -0.094158 ... 0.105635 -0.116247 -0.192328 -0.189961 -0.197390 -0.186312 0.027088 0.109094 -0.109320 -0.121918
home_shots 0.264244 -0.062635 0.094571 0.020847 0.616286 1.000000 0.642636 0.553946 -0.315042 -0.131675 ... 0.096194 -0.057754 -0.194536 -0.271975 -0.193698 -0.196480 0.019934 0.099750 -0.196825 -0.180034
home_shotsOnTarget 0.566983 -0.093423 0.347701 -0.028309 0.660090 0.642636 1.000000 0.418107 -0.173062 -0.088755 ... 0.090349 -0.085829 -0.175267 -0.188754 -0.178441 -0.172387 0.007727 0.093461 -0.115887 -0.115577
home_deep 0.244392 -0.064400 0.113474 -0.000579 0.475620 0.553946 0.418107 1.000000 -0.288527 -0.141393 ... 0.059884 -0.055654 -0.163040 -0.231498 -0.159504 -0.164685 -0.012631 0.062271 -0.157964 -0.140832
home_ppda -0.016869 0.081781 0.078865 -0.009549 -0.176584 -0.315042 -0.173062 -0.288527 1.000000 -0.262892 ... -0.088787 0.073543 0.213590 0.330254 0.279450 0.316622 -0.150219 -0.089900 0.236243 0.194573
home_fouls -0.076529 0.007621 -0.028519 -0.004332 -0.094158 -0.131675 -0.088755 -0.141393 -0.262892 1.000000 ... 0.057662 -0.017833 -0.006445 -0.016599 -0.029025 -0.032408 0.137761 0.056913 0.001880 0.019276
home_corners 0.012084 -0.042092 -0.061367 0.034687 0.254129 0.502022 0.284774 0.367745 -0.308924 -0.097236 ... 0.054653 -0.031431 -0.158470 -0.249516 -0.160727 -0.170428 0.014462 0.056948 -0.172609 -0.161806
home_yellowCards -0.110358 0.094092 -0.078678 0.074368 -0.100449 -0.108091 -0.105352 -0.119676 -0.080147 0.374735 ... 0.076461 0.057627 0.072610 0.046970 0.053507 0.040868 0.207308 0.072798 0.046643 0.047876
home_redCards -0.078921 0.119724 -0.034297 0.076753 -0.098981 -0.103266 -0.091291 -0.099079 0.067771 0.056263 ... 0.072942 0.085363 0.093660 0.094942 0.093364 0.091966 0.051473 0.070405 0.041284 0.054939
home_total_assists 0.821408 -0.076967 0.553991 -0.040397 0.452101 0.217384 0.470423 0.234139 -0.005551 -0.078300 ... 0.047441 -0.074768 -0.128867 -0.079672 -0.133707 -0.119529 -0.026944 0.047461 -0.023088 -0.059386
home_total_xAssists 0.524452 -0.144809 0.342967 -0.090912 0.839731 0.575684 0.588458 0.502796 -0.160965 -0.113044 ... 0.066275 -0.125070 -0.190370 -0.182505 -0.194183 -0.184140 -0.019356 0.069648 -0.109073 -0.104015
home_total_key_passes 0.265327 -0.082397 0.104662 -0.003695 0.570459 0.908958 0.603954 0.570200 -0.277828 -0.140194 ... 0.078618 -0.070950 -0.191964 -0.252754 -0.188689 -0.189453 -0.011155 0.081170 -0.179407 -0.163111
home_total_xGoalsChain 0.502133 -0.148832 0.335959 -0.094924 0.763594 0.536222 0.560903 0.518184 -0.144829 -0.133028 ... 0.071079 -0.129862 -0.191544 -0.185185 -0.195437 -0.184829 -0.032540 0.074187 -0.108836 -0.104444
home_total_xGoalsBuildup 0.436624 -0.130675 0.291998 -0.076459 0.660706 0.494315 0.496945 0.504410 -0.143694 -0.143809 ... 0.068314 -0.112934 -0.182871 -0.184755 -0.186350 -0.177641 -0.032985 0.071309 -0.111712 -0.108549
home_total_yellow_cards -0.106708 0.084468 -0.078712 0.066896 -0.097807 -0.102862 -0.100744 -0.113717 -0.086731 0.371447 ... 0.071962 0.050829 0.065940 0.041077 0.046666 0.034101 0.204574 0.069596 0.046805 0.044491
home_total_red_cards -0.079071 0.119791 -0.034715 0.075612 -0.098225 -0.102541 -0.089901 -0.098108 0.068472 0.055490 ... 0.071957 0.085616 0.094727 0.097042 0.093370 0.091666 0.050395 0.070403 0.043735 0.054188
home_total_blocked_shots -0.005759 0.002203 -0.065756 0.048534 0.265155 0.642414 0.189046 0.365277 -0.223701 -0.084942 ... 0.042664 -0.001887 -0.090538 -0.164384 -0.087446 -0.094405 0.014696 0.045924 -0.117326 -0.116969
home_total_saved_shots 0.108625 -0.066179 0.037200 -0.014952 0.416632 0.609566 0.838550 0.353822 -0.192002 -0.059849 ... 0.061206 -0.053057 -0.123582 -0.167741 -0.123431 -0.127655 0.021516 0.064107 -0.120989 -0.095684
away_xGoals -0.133065 0.606813 -0.060373 0.402162 -0.191813 -0.200735 -0.173102 -0.166742 0.213297 0.028875 ... -0.084375 0.458136 0.848863 0.573686 0.777782 0.671024 -0.037105 -0.085410 0.286165 0.407236
away_shots -0.075582 0.301792 0.024601 0.133999 -0.194928 -0.300408 -0.194557 -0.249095 0.332000 0.025043 ... -0.105038 0.248877 0.596302 0.910390 0.562520 0.517898 -0.068829 -0.106416 0.639642 0.596010
away_shotsOnTarget -0.095580 0.563273 -0.029802 0.362111 -0.159565 -0.180946 -0.142617 -0.150753 0.204663 0.017408 ... -0.084207 0.466609 0.586068 0.608096 0.566065 0.499806 -0.056838 -0.084546 0.198939 0.836928
away_deep -0.098196 0.250748 -0.012637 0.131121 -0.200643 -0.284092 -0.193742 -0.230418 0.369975 -0.010494 ... -0.094636 0.240326 0.517196 0.566438 0.533465 0.516915 -0.061512 -0.095641 0.361719 0.349723
away_ppda 0.124819 -0.053056 0.015049 0.030521 0.256575 0.371108 0.260832 0.399245 -0.188060 -0.239905 ... 0.049124 -0.043909 -0.175852 -0.276124 -0.163613 -0.159283 -0.146195 0.049897 -0.217609 -0.179384
away_fouls -0.038290 -0.020160 -0.031166 -0.003575 -0.036370 -0.030298 -0.034057 -0.076952 -0.213677 0.202256 ... 0.069693 -0.031553 -0.072198 -0.081153 -0.091082 -0.106216 0.373358 0.067834 -0.043602 -0.056888
away_corners -0.022524 0.036902 0.055099 -0.037495 -0.148225 -0.263293 -0.148992 -0.215366 0.244962 -0.013621 ... -0.066369 0.035033 0.252486 0.439652 0.178280 0.171512 -0.049244 -0.068195 0.415698 0.300191
away_yellowCards -0.002762 -0.015331 0.000536 -0.007396 0.034957 0.029367 0.013763 -0.006591 -0.152847 0.144217 ... 0.092816 -0.033586 -0.054613 -0.079562 -0.065720 -0.072139 0.967007 0.092696 -0.037797 -0.056870
away_redCards 0.078895 -0.065490 0.036997 -0.019710 0.105635 0.096194 0.090349 0.059884 -0.088787 0.057662 ... 1.000000 -0.063673 -0.092380 -0.111220 -0.094396 -0.087305 0.001595 0.986328 -0.058244 -0.065492
away_total_assists -0.083797 0.826960 -0.052697 0.556684 -0.116247 -0.057754 -0.085829 -0.055654 0.073543 -0.017833 ... -0.063673 1.000000 0.555497 0.302188 0.488442 0.436490 -0.029330 -0.064511 0.023279 0.065811
away_total_xAssists -0.148038 0.519682 -0.081793 0.342887 -0.192328 -0.194536 -0.175267 -0.163040 0.213590 -0.006445 ... -0.092380 0.555497 1.000000 0.653359 0.851944 0.761763 -0.049377 -0.093378 0.264687 0.381271
away_total_key_passes -0.090315 0.288618 0.005640 0.131941 -0.189961 -0.271975 -0.188754 -0.231498 0.330254 -0.016599 ... -0.111220 0.302188 0.653359 1.000000 0.591786 0.561007 -0.071542 -0.112611 0.559536 0.555142
away_total_xGoalsChain -0.153945 0.497257 -0.091736 0.336698 -0.197390 -0.193698 -0.178441 -0.159504 0.279450 -0.029025 ... -0.094396 0.488442 0.851944 0.591786 1.000000 0.957027 -0.062395 -0.095493 0.252394 0.371318
away_total_xGoalsBuildup -0.134761 0.428352 -0.075297 0.286618 -0.186312 -0.196480 -0.172387 -0.164685 0.316622 -0.032408 ... -0.087305 0.436490 0.761763 0.561007 0.957027 1.000000 -0.068923 -0.088300 0.238704 0.335150
away_total_yellow_cards -0.004914 -0.015292 -0.000887 -0.009993 0.027088 0.019934 0.007727 -0.012631 -0.150219 0.137761 ... 0.001595 -0.029330 -0.049377 -0.071542 -0.062395 -0.068923 1.000000 0.001025 -0.034741 -0.052578
away_total_red_cards 0.080064 -0.065757 0.037384 -0.017594 0.109094 0.099750 0.093461 0.062271 -0.089900 0.056913 ... 0.986328 -0.064511 -0.093378 -0.112611 -0.095493 -0.088300 0.001025 1.000000 -0.058946 -0.065399
away_total_blocked_shots -0.022438 0.038495 0.042827 -0.033448 -0.109320 -0.196825 -0.115887 -0.157964 0.236243 0.001880 ... -0.058244 0.023279 0.264687 0.559536 0.252394 0.238704 -0.034741 -0.058946 1.000000 0.201633
away_total_saved_shots -0.069618 0.098243 -0.014708 0.041945 -0.121918 -0.180034 -0.115577 -0.140832 0.194573 0.019276 ... -0.065492 0.065811 0.381271 0.555142 0.371318 0.335150 -0.052578 -0.065399 0.201633 1.000000

40 rows × 40 columns
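
Several pairs in the matrix above are almost duplicates of each other, for example away_redCards and away_total_red_cards (0.986328), away_yellowCards and away_total_yellow_cards (0.967007), and away_total_xGoalsChain and away_total_xGoalsBuildup (0.957027). A 40 x 40 matrix is hard to scan by eye, so one option is to list the pairs above a threshold. The helper name high_correlation_pairs, the threshold, and the demo values below are assumptions for illustration.

```python
import numpy as np
import pandas as pd


def high_correlation_pairs(frame, threshold=0.9):
    """Return column pairs whose absolute Pearson correlation >= threshold."""
    corr = frame.corr()
    # Keep only the strict upper triangle so each pair appears once
    # and the diagonal of ones is excluded.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack()
    # NaN entries (masked cells) never satisfy the comparison.
    pairs = pairs[pairs.abs() >= threshold]
    return pairs.sort_values(key=abs, ascending=False)


# Invented demo values, not rows from the real table.
demo = pd.DataFrame({
    "away_redCards": [0, 1, 0, 0, 2, 0],
    "away_total_red_cards": [0, 1, 0, 0, 2, 1],
    "away_fouls": [12, 9, 15, 11, 10, 14],
})
pairs = high_correlation_pairs(demo, threshold=0.8)
print(pairs)
```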

[Figure omitted: no description provided]
Skewed columns: ['home_xGoals', 'home_deep', 'home_ppda', 'home_redCards', 'home_total_assists', 'home_total_xAssists', 'home_total_xGoalsChain', 'home_total_xGoalsBuildup', 'home_total_red_cards', 'away_xGoals', 'away_deep', 'away_ppda', 'away_redCards', 'away_total_assists', 'away_total_xAssists', 'away_total_xGoalsChain', 'away_total_xGoalsBuildup', 'away_total_red_cards', 'away_total_blocked_shots']
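
The 19 columns listed match exactly the non-goal columns whose value in the Skewness table exceeds 1: away_total_blocked_shots (1.020413) is included while home_total_blocked_shots (0.988462) is not. The cutoff is not stated in the source, so the threshold of 1 in this sketch is an inference. The sketch uses five values copied from the Skewness table.

```python
import pandas as pd

# Five values copied from the Skewness table.
skewness = pd.Series({
    "home_ppda": 2.602424,
    "home_xGoals": 1.057793,
    "away_total_blocked_shots": 1.020413,
    "home_total_blocked_shots": 0.988462,
    "home_fouls": 0.387620,
})

# Assumed cutoff: absolute skewness greater than 1.
skewed_columns = skewness[skewness.abs() > 1].index.tolist()
print(f"Skewed columns: {skewed_columns}")
```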
home_xGoals home_shots home_shotsOnTarget home_deep home_ppda home_fouls home_corners home_yellowCards home_redCards home_total_assists home_total_xAssists home_total_key_passes home_total_xGoalsChain home_total_xGoalsBuildup home_total_yellow_cards home_total_red_cards home_total_blocked_shots home_total_saved_shots away_xGoals away_shots away_shotsOnTarget away_deep away_ppda away_fouls away_corners away_yellowCards away_redCards away_total_assists away_total_xAssists away_total_key_passes away_total_xGoalsChain away_total_xGoalsBuildup away_total_yellow_cards away_total_red_cards away_total_blocked_shots away_total_saved_shots
home_xGoals 1.000000 0.629700 0.676205 0.490572 -0.190396 -0.100012 0.266592 -0.100178 -0.098994 0.465161 0.854483 0.583735 0.782750 0.683526 -0.097441 -0.098842 0.279103 0.434595 -0.203855 -0.205391 -0.167961 -0.206215 0.278413 -0.040590 -0.151564 0.033374 0.108958 -0.120000 -0.203569 -0.201094 -0.208703 -0.198090 0.024007 0.112732 -0.121896 -0.126465
home_shots 0.629700 1.000000 0.664302 0.555342 -0.323308 -0.143358 0.521157 -0.111155 -0.100345 0.217487 0.587158 0.919711 0.549993 0.498679 -0.105691 -0.099813 0.668401 0.631525 -0.209873 -0.309636 -0.189016 -0.279483 0.385874 -0.042263 -0.268496 0.023254 0.100343 -0.060860 -0.199048 -0.282818 -0.201666 -0.200045 0.012422 0.103960 -0.208027 -0.186125
home_shotsOnTarget 0.676205 0.664302 1.000000 0.428827 -0.178512 -0.107106 0.298436 -0.115629 -0.091992 0.477356 0.608535 0.627367 0.579888 0.518232 -0.110396 -0.090929 0.206916 0.846096 -0.184316 -0.205030 -0.149913 -0.199017 0.294317 -0.052931 -0.153183 0.006633 0.093368 -0.090249 -0.182726 -0.198427 -0.186717 -0.178356 -0.001791 0.095703 -0.125841 -0.120483
home_deep 0.490572 0.555342 0.428827 1.000000 -0.296802 -0.136951 0.370490 -0.111548 -0.104962 0.242030 0.517134 0.569922 0.535408 0.511825 -0.105468 -0.104991 0.371457 0.359634 -0.175495 -0.259221 -0.160664 -0.232322 0.399520 -0.075542 -0.221489 -0.001940 0.065757 -0.062478 -0.169032 -0.240961 -0.170316 -0.174496 -0.010788 0.067784 -0.168472 -0.145080
home_ppda -0.190396 -0.323308 -0.178512 -0.296802 1.000000 -0.281645 -0.313872 -0.094231 0.071115 -0.016017 -0.171996 -0.287086 -0.160580 -0.150506 -0.101331 0.071442 -0.228746 -0.196579 0.236411 0.351482 0.226997 0.375054 -0.195823 -0.230889 0.259561 -0.156866 -0.092518 0.088857 0.235377 0.350130 0.304012 0.337934 -0.154115 -0.093282 0.252130 0.204920
home_fouls -0.100012 -0.143358 -0.107106 -0.136951 -0.281645 1.000000 -0.110362 0.385085 0.062465 -0.083134 -0.123119 -0.153830 -0.142175 -0.157334 0.383118 0.062517 -0.096945 -0.076214 0.022231 0.013208 0.003008 -0.007650 -0.255536 0.208286 -0.024008 0.144060 0.063086 -0.024134 -0.017566 -0.028954 -0.036836 -0.041257 0.137741 0.062660 -0.006797 0.011567
home_corners 0.266592 0.521157 0.298436 0.370490 -0.313872 -0.110362 1.000000 -0.067538 -0.057537 0.019419 0.274472 0.490454 0.197493 0.189316 -0.062863 -0.057810 0.439845 0.352678 -0.175546 -0.278691 -0.162572 -0.242691 0.282723 -0.039975 -0.241526 0.019761 0.056714 -0.036776 -0.166239 -0.259380 -0.169805 -0.177308 0.017893 0.058181 -0.185355 -0.164746
home_yellowCards -0.100178 -0.111155 -0.115629 -0.111548 -0.094231 0.385085 -0.067538 1.000000 0.114110 -0.108281 -0.128564 -0.129728 -0.136260 -0.143394 0.973463 0.109470 -0.045782 -0.063926 0.107183 0.071566 0.067263 0.055796 -0.194338 0.109031 0.040873 0.219896 0.080509 0.051736 0.066504 0.042219 0.050175 0.035180 0.206522 0.076983 0.050909 0.043538
home_redCards -0.098994 -0.100345 -0.091992 -0.104962 0.071115 0.062465 -0.057537 0.114110 1.000000 -0.074859 -0.104743 -0.100087 -0.106351 -0.098815 0.025573 0.987508 -0.051322 -0.060937 0.127200 0.114046 0.108322 0.069418 -0.108395 0.054753 0.051630 0.066679 0.074056 0.087855 0.098547 0.099486 0.097236 0.096560 0.056351 0.070053 0.043154 0.059742
home_total_assists 0.465161 0.217487 0.477356 0.242030 -0.016017 -0.083134 0.019419 -0.108281 -0.074859 1.000000 0.579225 0.288688 0.517795 0.476599 -0.105412 -0.075766 -0.016259 0.086517 -0.127625 -0.074053 -0.090502 -0.092480 0.140717 -0.059380 -0.020383 -0.029262 0.046470 -0.078238 -0.137439 -0.086594 -0.141238 -0.129521 -0.028969 0.046652 -0.026609 -0.059526
home_total_xAssists 0.854483 0.587158 0.608535 0.517134 -0.171996 -0.123119 0.274472 -0.128564 -0.104743 0.579225 1.000000 0.653030 0.856996 0.777035 -0.123647 -0.105731 0.249675 0.400787 -0.206169 -0.201240 -0.163231 -0.203086 0.281368 -0.082591 -0.139407 -0.018198 0.065356 -0.130115 -0.200552 -0.194720 -0.204799 -0.194711 -0.026243 0.069021 -0.120699 -0.108962
home_total_key_passes 0.583735 0.919711 0.627367 0.569922 -0.287086 -0.153830 0.490454 -0.129728 -0.100087 0.288688 0.653030 1.000000 0.592962 0.554297 -0.123275 -0.099823 0.586108 0.590261 -0.211191 -0.289856 -0.182937 -0.266540 0.389813 -0.083951 -0.247528 -0.009930 0.081627 -0.075177 -0.198009 -0.264384 -0.198369 -0.195042 -0.019871 0.083887 -0.190754 -0.166930
home_total_xGoalsChain 0.782750 0.549993 0.579888 0.535408 -0.160580 -0.142175 0.197493 -0.136260 -0.106351 0.517795 0.856996 0.592962 1.000000 0.958529 -0.131280 -0.106643 0.236754 0.379940 -0.206758 -0.206299 -0.166607 -0.198771 0.354354 -0.119549 -0.142693 -0.030874 0.071512 -0.134487 -0.199873 -0.198253 -0.204251 -0.193188 -0.037941 0.074847 -0.119954 -0.110933
home_total_xGoalsBuildup 0.683526 0.498679 0.518232 0.511825 -0.150506 -0.157334 0.189316 -0.143394 -0.098815 0.476599 0.777035 0.554297 0.958529 1.000000 -0.138500 -0.099283 0.217232 0.340905 -0.197313 -0.205574 -0.161263 -0.200574 0.390176 -0.122227 -0.145322 -0.036486 0.067227 -0.119916 -0.190616 -0.197049 -0.194218 -0.183128 -0.043018 0.070203 -0.122185 -0.113504
home_total_yellow_cards -0.097441 -0.105691 -0.110396 -0.105468 -0.101331 0.383118 -0.062863 0.973463 0.025573 -0.105412 -0.123647 -0.123275 -0.131280 -0.138500 1.000000 0.021401 -0.041837 -0.060541 0.098659 0.066668 0.060499 0.052145 -0.192291 0.104567 0.035944 0.209606 0.074005 0.044254 0.057738 0.035491 0.042370 0.027231 0.204561 0.071735 0.050412 0.040004
home_total_red_cards -0.098842 -0.099813 -0.090929 -0.104991 0.071442 0.062517 -0.057810 0.109470 0.987508 -0.075766 -0.105731 -0.099823 -0.106643 -0.099283 0.021401 1.000000 -0.050775 -0.059096 0.127405 0.116140 0.107994 0.072018 -0.106373 0.052604 0.053228 0.065536 0.072542 0.088198 0.099982 0.101728 0.097658 0.096957 0.055574 0.070382 0.045690 0.059244
home_total_blocked_shots 0.279103 0.668401 0.206916 0.371457 -0.228746 -0.096945 0.439845 -0.045782 -0.051322 -0.016259 0.249675 0.586108 0.236754 0.217232 -0.041837 -0.050775 1.000000 0.239274 -0.101788 -0.195000 -0.105164 -0.164107 0.255742 -0.045225 -0.175217 0.008098 0.042837 -0.006582 -0.098375 -0.173393 -0.098332 -0.103039 0.001249 0.045300 -0.131086 -0.123091
home_total_saved_shots 0.434595 0.631525 0.846096 0.359634 -0.196579 -0.076214 0.352678 -0.063926 -0.060937 0.086517 0.400787 0.590261 0.379940 0.340905 -0.060541 -0.059096 0.239274 1.000000 -0.133556 -0.190979 -0.118275 -0.172087 0.251877 -0.027206 -0.164037 0.025497 0.067075 -0.056103 -0.126198 -0.175353 -0.128300 -0.127885 0.016092 0.069495 -0.127079 -0.100186
away_xGoals -0.203855 -0.209873 -0.184316 -0.175495 0.236411 0.022231 -0.175546 0.107183 0.127200 -0.127625 -0.206169 -0.211191 -0.206758 -0.197313 0.098659 0.127405 -0.101788 -0.133556 1.000000 0.643036 0.670090 0.497107 -0.201229 -0.061005 0.262636 -0.045964 -0.088949 0.478027 0.863438 0.589214 0.793977 0.692429 -0.044912 -0.089721 0.295435 0.421620
away_shots -0.205391 -0.309636 -0.205030 -0.259221 0.351482 0.013208 -0.278691 0.071566 0.114046 -0.074053 -0.201240 -0.289856 -0.206299 -0.205574 0.066668 0.116140 -0.195000 -0.190979 0.643036 1.000000 0.664262 0.552582 -0.320301 -0.085220 0.492374 -0.084820 -0.106623 0.256536 0.603442 0.918409 0.573138 0.515449 -0.078255 -0.107669 0.665774 0.609228
away_shotsOnTarget -0.167961 -0.189016 -0.149913 -0.160664 0.226997 0.003008 -0.162572 0.067263 0.108322 -0.090502 -0.163231 -0.182937 -0.166607 -0.161263 0.060499 0.107994 -0.105164 -0.118275 0.670090 0.664262 1.000000 0.422259 -0.181685 -0.071438 0.274879 -0.073816 -0.085729 0.481895 0.605108 0.624151 0.582147 0.515474 -0.067601 -0.085624 0.207015 0.842336
away_deep -0.206215 -0.279483 -0.199017 -0.232322 0.375054 -0.007650 -0.242691 0.055796 0.069418 -0.092480 -0.203086 -0.266540 -0.198771 -0.200574 0.052145 0.072018 -0.164107 -0.172087 0.497107 0.552582 0.422259 1.000000 -0.308681 -0.092714 0.346901 -0.065620 -0.105110 0.246487 0.522549 0.562436 0.540648 0.512506 -0.060731 -0.105707 0.365309 0.349812
away_ppda 0.278413 0.385874 0.294317 0.399520 -0.195823 -0.255536 0.282723 -0.194338 -0.108395 0.140717 0.281368 0.389813 0.354354 0.390176 -0.192291 -0.106373 0.255742 0.251877 -0.201229 -0.320301 -0.181685 -0.308681 1.000000 -0.356793 -0.297023 -0.146449 0.047168 -0.051769 -0.186143 -0.287929 -0.178794 -0.165204 -0.154527 0.047969 -0.229163 -0.186009
away_fouls -0.040590 -0.042263 -0.052931 -0.075542 -0.230889 0.208286 -0.039975 0.109031 0.054753 -0.059380 -0.082591 -0.083951 -0.119549 -0.122227 0.104567 0.052604 -0.045225 -0.027206 -0.061005 -0.085220 -0.071438 -0.092714 -0.356793 1.000000 -0.061856 0.379716 0.070861 -0.037315 -0.085490 -0.096144 -0.103436 -0.120408 0.376716 0.069503 -0.058328 -0.062740
away_corners -0.151564 -0.268496 -0.153183 -0.221489 0.259561 -0.024008 -0.241526 0.040873 0.051630 -0.020383 -0.139407 -0.247528 -0.142693 -0.145322 0.035944 0.053228 -0.175217 -0.164037 0.262636 0.492374 0.274879 0.346901 -0.297023 -0.061856 1.000000 -0.054198 -0.067630 0.030838 0.249847 0.453112 0.182032 0.169143 -0.051801 -0.069192 0.430931 0.308653
away_yellowCards 0.033374 0.023254 0.006633 -0.001940 -0.156866 0.144060 0.019761 0.219896 0.066679 -0.029262 -0.018198 -0.009930 -0.030874 -0.036486 0.209606 0.065536 0.008098 0.025497 -0.045964 -0.084820 -0.073816 -0.065620 -0.146449 0.379716 -0.054198 1.000000 0.110205 -0.035637 -0.063250 -0.089198 -0.075140 -0.084452 0.961467 0.109668 -0.048180 -0.059991
away_redCards 0.108958 0.100343 0.093368 0.065757 -0.092518 0.063086 0.056714 0.080509 0.074056 0.046470 0.065356 0.081627 0.071512 0.067227 0.074005 0.072542 0.042837 0.067075 -0.088949 -0.106623 -0.085729 -0.105110 0.047168 0.070861 -0.067630 0.110205 1.000000 -0.066396 -0.095620 -0.110921 -0.097863 -0.087379 0.003636 0.986061 -0.061257 -0.063864
away_total_assists -0.120000 -0.060860 -0.090249 -0.062478 0.088857 -0.024134 -0.036776 0.051736 0.087855 -0.078238 -0.130115 -0.075177 -0.134487 -0.119916 0.044254 0.088198 -0.006582 -0.056103 0.478027 0.256536 0.481895 0.246487 -0.051769 -0.037315 0.030838 -0.035637 -0.066396 1.000000 0.580154 0.311816 0.509840 0.464841 -0.032015 -0.067070 0.027596 0.074074
away_total_xAssists -0.203569 -0.199048 -0.182726 -0.169032 0.235377 -0.017566 -0.166239 0.066504 0.098547 -0.137439 -0.200552 -0.198009 -0.199873 -0.190616 0.057738 0.099982 -0.098375 -0.126198 0.863438 0.603442 0.605108 0.522549 -0.186143 -0.085490 0.249847 -0.063250 -0.095620 0.580154 1.000000 0.659626 0.859456 0.776572 -0.058174 -0.096447 0.269369 0.389388
away_total_key_passes -0.201094 -0.282818 -0.198427 -0.240961 0.350130 -0.028954 -0.259380 0.042219 0.099486 -0.086594 -0.194720 -0.264384 -0.198253 -0.197049 0.035491 0.101728 -0.173393 -0.175353 0.589214 0.918409 0.624151 0.562436 -0.287929 -0.096144 0.453112 -0.089198 -0.110921 0.311816 0.659626 1.000000 0.602110 0.557991 -0.081569 -0.111661 0.587933 0.570559
away_total_xGoalsChain -0.208703 -0.201666 -0.186717 -0.170316 0.304012 -0.036836 -0.169805 0.050175 0.097236 -0.141238 -0.204799 -0.198369 -0.204251 -0.194218 0.042370 0.097658 -0.098332 -0.128300 0.793977 0.573138 0.582147 0.540648 -0.178794 -0.103436 0.182032 -0.075140 -0.097863 0.509840 0.859456 0.602110 1.000000 0.957640 -0.070263 -0.098830 0.261093 0.381129
away_total_xGoalsBuildup -0.198090 -0.200045 -0.178356 -0.174496 0.337934 -0.041257 -0.177308 0.035180 0.096560 -0.129521 -0.194711 -0.195042 -0.193188 -0.183128 0.027231 0.096957 -0.103039 -0.127885 0.692429 0.515449 0.515474 0.512506 -0.165204 -0.120408 0.169143 -0.084452 -0.087379 0.464841 0.776572 0.557991 0.957640 1.000000 -0.079839 -0.088471 0.236589 0.336635
away_total_yellow_cards 0.024007 0.012422 -0.001791 -0.010788 -0.154115 0.137741 0.017893 0.206522 0.056351 -0.028969 -0.026243 -0.019871 -0.037941 -0.043018 0.204561 0.055574 0.001249 0.016092 -0.044912 -0.078255 -0.067601 -0.060731 -0.154527 0.376716 -0.051801 0.961467 0.003636 -0.032015 -0.058174 -0.081569 -0.070263 -0.079839 1.000000 0.002613 -0.046032 -0.053219
away_total_red_cards 0.112732 0.103960 0.095703 0.067784 -0.093282 0.062660 0.058181 0.076983 0.070053 0.046652 0.069021 0.083887 0.074847 0.070203 0.071735 0.070382 0.045300 0.069495 -0.089721 -0.107669 -0.085624 -0.105707 0.047969 0.069503 -0.069192 0.109668 0.986061 -0.067070 -0.096447 -0.111661 -0.098830 -0.088471 0.002613 1.000000 -0.061719 -0.063496
away_total_blocked_shots -0.121896 -0.208027 -0.125841 -0.168472 0.252130 -0.006797 -0.185355 0.050909 0.043154 -0.026609 -0.120699 -0.190754 -0.119954 -0.122185 0.050412 0.045690 -0.131086 -0.127079 0.295435 0.665774 0.207015 0.365309 -0.229163 -0.058328 0.430931 -0.048180 -0.061257 0.027596 0.269369 0.587933 0.261093 0.236589 -0.046032 -0.061719 1.000000 0.211137
away_total_saved_shots -0.126465 -0.186125 -0.120483 -0.145080 0.204920 0.011567 -0.164746 0.043538 0.059742 -0.059526 -0.108962 -0.166930 -0.110933 -0.113504 0.040004 0.059244 -0.123091 -0.100186 0.421620 0.609228 0.842336 0.349812 -0.186009 -0.062740 0.308653 -0.059991 -0.063864 0.074074 0.389388 0.570559 0.381129 0.336635 -0.053219 -0.063496 0.211137 1.000000

1. Why Two Correlation Matrices?¶

Spearman’s Correlation:

Measures monotonic relationships (if one variable goes up, does the other consistently go up or down?). It is rank-based, so it is more robust to outliers and skew. A high Spearman correlation (ρ close to +1 or -1) means the two variables move together in rank, even if the relationship isn’t strictly linear.

Pearson’s Correlation:

Measures linear relationships. It is more sensitive to outliers and skewed data. A high Pearson correlation (r close to +1 or -1) means the two variables move together in a roughly linear fashion.

By comparing them, you see where variables are consistently related (both Spearman and Pearson are large in magnitude) versus where the relationship might be non-linear (Spearman is strong, but Pearson is weaker) or influenced by outliers/skew.
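As a minimal sketch of this comparison, the snippet below builds both matrices with pandas and subtracts them. The data is synthetic and the column names (`x`, `y_cubic`, `y_linear`) are hypothetical; with the notebook’s own frame, the same two `corr` calls would apply to its numeric columns.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic, right-skewed stand-in data (hypothetical; not the notebook's frame)
n = 2000
x = rng.lognormal(mean=0.0, sigma=1.0, size=n)
demo = pd.DataFrame({
    "x": x,
    "y_cubic": x ** 3,                              # monotonic in x but far from linear
    "y_linear": 2.0 * x + rng.normal(0.0, 1.0, n),  # linear in x plus noise
})

pearson = demo.corr(method="pearson")
spearman = demo.corr(method="spearman")

# Positive entries: rank agreement is stronger than linear agreement
gap = spearman - pearson
print(gap.round(3))
```

Large positive entries in `gap` point at pairs that are monotonic but curved or outlier-driven; entries near zero point at pairs where the two measures agree.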

2. Identify Strong Relationships¶

Look for correlation coefficients with large absolute values (e.g., above ~0.5 or 0.6). For instance:

home_xGoals is strongly correlated with home_total_xAssists in the Spearman table (ρ ≈ 0.84), suggesting that as the home team’s expected goals increase, so do their total xAssists in a fairly monotonic way. Pearson’s matrix shows a similarly strong linear correlation for the same pair (r ≈ 0.85). When a relationship is consistently strong in both Spearman and Pearson, that often means you have a robust, near-linear association.

3. Spot Differences Due to Skew¶

Some variables might show a higher correlation under Spearman than Pearson (or vice versa). This often indicates that skew or outliers are influencing Pearson, making the linear correlation weaker or stronger than the monotonic trend. For instance, home_ppda may correlate differently with certain offensive stats in Spearman vs. Pearson. If the difference is large, a few extreme values may be affecting the linear measure, or the relationship may be monotonic but curved.

4. Multicollinearity Concerns¶

If you see multiple columns with correlations near ±0.8 or ±0.9, that suggests multicollinearity. For example, home_shots and home_total_key_passes are strongly correlated (Pearson ≈ 0.92 in the matrix above). In modeling:

Dropping one of the highly correlated features or combining them (e.g., via PCA) can help reduce redundancy. Check both Spearman and Pearson for strong clusters of correlated features.
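One way to list candidate redundant pairs is to scan the upper triangle of a correlation matrix. The helper `high_corr_pairs` below is a hypothetical name, not part of the notebook; the small matrix is an excerpt of the Pearson values printed above, rounded to three decimals.

```python
import numpy as np
import pandas as pd

def high_corr_pairs(corr: pd.DataFrame, threshold: float = 0.8) -> pd.DataFrame:
    """Return each feature pair whose |correlation| is at least `threshold`."""
    # Strict upper triangle: every pair once, no self-correlations
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack().rename("corr").reset_index()
    pairs.columns = ["feature_a", "feature_b", "corr"]
    pairs = pairs[pairs["corr"].abs() >= threshold]
    pairs = pairs.sort_values("corr", key=lambda s: s.abs(), ascending=False)
    return pairs.reset_index(drop=True)

cols = ["home_shots", "home_total_key_passes",
        "home_total_xGoalsChain", "home_total_xGoalsBuildup"]
excerpt = pd.DataFrame(
    [[1.000, 0.920, 0.550, 0.499],
     [0.920, 1.000, 0.593, 0.554],
     [0.550, 0.593, 1.000, 0.959],
     [0.499, 0.554, 0.959, 1.000]],
    index=cols, columns=cols,
)

redundant = high_corr_pairs(excerpt, threshold=0.8)
print(redundant)
```

On the full matrix, the same call would be `high_corr_pairs(df.corr(), 0.8)`, where `df` stands for whichever numeric frame the matrices were computed from.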

5. Modeling Implications¶

Feature Selection: If two variables are almost duplicates (like home_total_xGoalsChain vs. home_total_xGoalsBuildup, Pearson ≈ 0.96), you might choose only one to avoid redundancy.

Transformations: If you see big differences between Spearman and Pearson, consider a log or other variance-stabilizing transform on heavily skewed continuous variables (like home_ppda) before using them in a linear model. Sparse counts such as home_total_red_cards are mostly 0 or 1, so a transform changes little there; a binary indicator is often a simpler encoding.

Interpretation: Spearman tells you which pairs move together in rank order, which is good for capturing monotonic trends. Pearson tells you which pairs have a more direct linear relationship, which is useful for checking linear regression assumptions.

6. Next Steps¶

Highlight Strong Correlations: Identify pairs with |corr| ≥ 0.7 or 0.8. Investigate whether they are truly redundant or whether each has unique predictive value.

Check for Potential Data Issues: Extremely high correlation can signal duplicates or derived columns. For example, home_redCards and home_total_red_cards (Pearson ≈ 0.99) look like two versions of the same count, and home_shots partly overlaps with home_total_blocked_shots (Pearson ≈ 0.67).

Consider Log Transform: For columns with high skew, a log transform might help linear models. Counts that include zeros (e.g., red cards, which are usually 0 or 1 with occasional higher values) need log(1 + x) or a Yeo-Johnson transform, because Box-Cox requires strictly positive values. Re-check Pearson correlation after transformation if linear relationships are of interest.

Use Spearman for Rank-Based Insights: If your target variable (gameresult) is ordinal or you suspect non-linear but monotonic relationships, Spearman can be more informative.
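The effect of a transform on the two correlation measures can be checked directly. This is a sketch on synthetic data: `skewed` and `target` are hypothetical variables constructed so that the relationship is linear on the log scale, which is an assumption of the example and not a property of the notebook’s columns.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 5000

# Hypothetical right-skewed, non-negative feature and a target linear in its log
skewed = rng.lognormal(mean=0.0, sigma=1.0, size=n)
target = np.log(skewed) + rng.normal(0.0, 0.3, size=n)

r_raw = stats.pearsonr(skewed, target)[0]
r_log1p = stats.pearsonr(np.log1p(skewed), target)[0]   # log(1 + x) is safe when x can be 0

# Yeo-Johnson also accepts zeros; SciPy estimates lambda by maximum likelihood
yj_values, yj_lambda = stats.yeojohnson(skewed)
r_yj = stats.pearsonr(yj_values, target)[0]

# Spearman depends only on ranks, so a monotonic transform leaves it unchanged
rho_raw = stats.spearmanr(skewed, target)[0]
rho_log1p = stats.spearmanr(np.log1p(skewed), target)[0]

print(f"Pearson raw={r_raw:.3f}, log1p={r_log1p:.3f}, Yeo-Johnson={r_yj:.3f}")
print(f"Spearman raw={rho_raw:.3f}, log1p={rho_log1p:.3f}")
```

If Pearson moves toward Spearman after the transform, skew was hiding a relationship that is close to linear on the transformed scale.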

Summary¶

You have two correlation matrices—Spearman and Pearson—because your data is skewed, and you want to compare monotonic vs. linear relationships. High correlation in both indicates a strong linear relationship that’s robust to skew. Differences between Spearman and Pearson can highlight skew/outliers or non-linear patterns. Use these insights to choose features, consider transformations, and interpret whether variables have linear or simply monotonic relationships. Ultimately, both matrices are valuable: Spearman for robust, monotonic insight and Pearson for linear modeling considerations.

T-Test¶

T-statistic: 51.8705, p-value: 0.0000

You can confidently reject the null hypothesis that mean home_xGoals is the same in home wins and non-home wins (the p-value prints as 0.0000 because it is smaller than 0.00005, not because it is exactly zero). The sign of the t-statistic follows the order in which the two groups were passed to the test; assuming home wins were the first group, the positive value (51.87) means the home team’s expected goals are higher on average when they win than when they don’t. A difference this large is very unlikely to arise by chance, so home_xGoals looks like a strong indicator of match outcome in this dataset, although a t-test alone does not measure predictive accuracy.
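The notebook’s test code is not shown, so the following is only a sketch of how such a comparison could be run. It uses synthetic data with hypothetical group means, Welch’s version of the t-test (which does not assume equal variances), and Cohen’s d as a rough effect size.

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(2)
n = 3000

# Synthetic stand-in: outcome coded 0/1/2 as in the notebook, xG with made-up means
gameresult = rng.choice([0, 1, 2], size=n, p=[0.30, 0.25, 0.45])
home_xg = rng.gamma(shape=4.0, scale=np.where(gameresult == 2, 0.45, 0.32))
demo = pd.DataFrame({"gameresult": gameresult, "home_xGoals": home_xg})

wins = demo.loc[demo["gameresult"] == 2, "home_xGoals"]
others = demo.loc[demo["gameresult"] != 2, "home_xGoals"]

# Group order fixes the sign: positive t means wins have the higher mean
t_stat, p_value = stats.ttest_ind(wins, others, equal_var=False)

# Cohen's d with a simple average of the two variances
pooled_sd = np.sqrt((wins.var(ddof=1) + others.var(ddof=1)) / 2.0)
cohens_d = (wins.mean() - others.mean()) / pooled_sd

print(f"T-statistic: {t_stat:.4f}, p-value: {p_value:.4f}, Cohen's d: {cohens_d:.3f}")
```

Reporting the effect size next to the p-value makes it clear how far apart the groups are, which the p-value alone cannot show at this sample size.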

ANOVA F-statistic: 1547.4971, p-value: 0.0000

This indicates that mean home_xGoals is not the same across the three groups (home win, draw, away win): at least one group’s mean differs from the others. ANOVA does not say which groups differ; that needs a post-hoc test such as Tukey’s HSD. The result supports treating home_xGoals as an important metric for differentiating match outcomes.
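A one-way ANOVA per feature can be looped over columns with `scipy.stats.f_oneway`. The sketch below uses synthetic data in which one hypothetical feature depends on the outcome and one does not; the column names mirror the notebook’s, but the values are made up.

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(3)
n = 3000
gameresult = rng.choice([0, 1, 2], size=n, p=[0.30, 0.25, 0.45])

demo = pd.DataFrame({
    "gameresult": gameresult,
    # Hypothetical: mean shot count rises with the outcome code
    "home_shots": rng.poisson(lam=10.0 + 2.0 * gameresult),
    # Hypothetical: generated independently of the outcome
    "home_corners": rng.poisson(lam=5.0, size=n),
})

anova_results = {}
for feature in ["home_shots", "home_corners"]:
    # One array of values per outcome group
    groups = [grp[feature].to_numpy() for _, grp in demo.groupby("gameresult")]
    f_stat, p_value = stats.f_oneway(*groups)
    anova_results[feature] = (f_stat, p_value)
    print(f"ANOVA for {feature}: F-statistic = {f_stat:.4f}, p-value = {p_value:.4f}")
```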

Paired t-test: t-statistic = 28.1871, p-value = 0.0000

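The paired t-test compares home_xGoals and away_xGoals within each game and tests whether the mean per-game difference is zero. The highly significant result shows that the two differ systematically; assuming the difference was computed as home minus away, the positive t-statistic means home teams generate more expected goals on average, which is consistent with home advantage. This test does not involve gameresult, so whether the per-game xGoals difference is informative about the match outcome has to be checked separately (for example, by running the ANOVA above on that difference).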
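A paired t-test is the same as a one-sample t-test on the per-game differences, which the sketch below demonstrates on synthetic data. The gamma parameters are hypothetical and only chosen so that the home side has the higher mean.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n = 2000

# Synthetic per-game expected goals (hypothetical means 1.6 and 1.28)
home_xg = rng.gamma(shape=4.0, scale=0.40, size=n)
away_xg = rng.gamma(shape=4.0, scale=0.32, size=n)

# Paired test: argument order (home, away) means a positive t favours the home side
t_rel, p_rel = stats.ttest_rel(home_xg, away_xg)

# Equivalent formulation on the differences
diff = home_xg - away_xg
t_one, p_one = stats.ttest_1samp(diff, popmean=0.0)

print(f"Paired t-test: t-statistic = {t_rel:.4f}, p-value = {p_rel:.4f}")
print(f"Mean per-game difference (home - away): {diff.mean():.3f}")
```

The `diff` column is also the natural feature to test against gameresult if the goal is to show that the xGoals gap separates outcomes.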

ANOVA for home_shots: F-statistic = 220.7895, p-value = 0.0000
ANOVA for away_shots: F-statistic = 331.2351, p-value = 0.0000
ANOVA for home_deep: F-statistic = 224.1493, p-value = 0.0000
ANOVA for away_deep: F-statistic = 333.4027, p-value = 0.0000
ANOVA for home_ppda: F-statistic = 55.9803, p-value = 0.0000
ANOVA for away_ppda: F-statistic = 81.4031, p-value = 0.0000
ANOVA for home_fouls: F-statistic = 22.5709, p-value = 0.0000
ANOVA for away_fouls: F-statistic = 29.2506, p-value = 0.0000
ANOVA for home_corners: F-statistic = 11.3146, p-value = 0.0000
ANOVA for away_corners: F-statistic = 4.2018, p-value = 0.0150
ANOVA for home_yellowCards: F-statistic = 84.8338, p-value = 0.0000
ANOVA for away_yellowCards: F-statistic = 16.2915, p-value = 0.0000
ANOVA for home_redCards: F-statistic = 108.9195, p-value = 0.0000
ANOVA for away_redCards: F-statistic = 74.0203, p-value = 0.0000

These ANOVA results tell you that for every feature tested, the mean value differs significantly across the match outcome groups (gameresult). With more than 12,000 games, even small differences in means reach significance, so the size of the F-statistic is a better guide to how strongly a feature separates the groups than the p-value is. In other words:

High F-statistics & Near-Zero P-values: Almost every feature (e.g., home_shots, away_shots, home_deep, away_deep) shows a high F-statistic and a p-value that rounds to 0.0000. This indicates that the probability of observing such differences by chance is extremely low. homeGoalsHalfTime and awayGoalsHalfTime, which are not included in the listing above, are reported to have F-statistics over 1400, suggesting that these variables differ drastically across the different match outcomes.

Away_corners Exception: Although the F-statistic for away_corners is lower (4.2018), its p-value is still below a typical significance threshold (p = 0.0150), meaning that even this feature shows statistically significant differences between groups.
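To put the F-statistics on a common scale, they can be converted to eta squared, the share of variance explained by the outcome groups: η² = F·df_between / (F·df_between + df_within). This effect size is an addition, not something the notebook computes. The sketch assumes all 12,680 games entered each ANOVA (so df_within = 12680 − 3), which may be slightly off for features with missing values.

```python
def eta_squared(f_stat: float, df_between: int, df_within: int) -> float:
    """Share of total variance explained by group membership in a one-way ANOVA."""
    return (f_stat * df_between) / (f_stat * df_between + df_within)

N_GAMES = 12680          # rows in the dataset (assumed to be used in full)
K_GROUPS = 3             # away win, draw, home win
DF_BETWEEN = K_GROUPS - 1
DF_WITHIN = N_GAMES - K_GROUPS

# F-statistics copied from the output above
reported_f = {
    "home_xGoals": 1547.4971,
    "home_shots": 220.7895,
    "home_redCards": 108.9195,
    "away_corners": 4.2018,
}

effect_sizes = {name: eta_squared(f, DF_BETWEEN, DF_WITHIN) for name, f in reported_f.items()}
for name, eta2 in effect_sizes.items():
    print(f"{name}: eta^2 = {eta2:.4f}")
```

Under that assumption, home_xGoals explains roughly a fifth of its variance through the outcome groups, while away_corners explains well under 0.1%, even though both tests are significant.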

2    5654
0    3854
1    3172
Name: gameresult, dtype: int64

0 for away win, 1 for draw, 2 for home win

Overall Observations

Most Features Show Statistically Significant Differences: For nearly all features, Tukey’s HSD indicates at least some pairwise group differences with p < 0.05, meaning the mean values of these features differ across the three gameresult categories.

Variance Differences in Many Cases: Levene’s test often returns a very small p-value (e.g., 0.0000), suggesting the assumption of equal variances (homoscedasticity) does not hold for a number of these features. A few features (like home_fouls, away_fouls) do not exhibit significant variance differences.

Some Features Differ Between All Pairs; Others Only Some

All pairs differ: home_shots, away_shots, home_deep, away_deep, home_redCards, away_redCards, homeGoalsHalfTime, and awayGoalsHalfTime each show significant mean differences among all three pairs (0 vs. 1, 0 vs. 2, and 1 vs. 2).

Partial differences: Some features only differ for certain group pairs. For example:

home_ppda: differs between (0 vs. 1) and (0 vs. 2), but not (1 vs. 2).
home_fouls: differs between (0 vs. 2) and (1 vs. 2), but not (0 vs. 1).
away_fouls: differs between (0 vs. 1) and (1 vs. 2), but not (0 vs. 2).

Interpreting “Reject=True”: Whenever Tukey’s HSD shows “reject = True,” it means there is a statistically significant difference in the mean of that feature between those two gameresult groups at the chosen alpha level (0.05 by default).

High-Level Takeaways

Shots (home_shots, away_shots) and Deep (home_deep, away_deep) are strongly different among all outcomes, indicating that teams’ shot counts and attacking penetration vary considerably depending on the result.

Half-Time Goals (homeGoalsHalfTime, awayGoalsHalfTime) also differ significantly for all pairs, suggesting that scoring patterns before halftime strongly correlate with the final match outcome.

Red Cards (home_redCards, away_redCards) differ across all three groups, but the differences are quite small in absolute terms. Still, they are statistically significant.

PPDA (home_ppda, away_ppda) shows partial differences. For instance, away_ppda is significantly different between (0 vs. 2) and (1 vs. 2) but not between (0 vs. 1).

Fouls and Corners often differ for two of the pairs but not all three, indicating these stats may be somewhat less discriminative than others like shots or goals.

Variance Differences: In many cases (e.g., home_ppda, away_ppda, home_shots), Levene’s test reveals that the spread of values is not the same across all outcome groups, meaning you should be cautious if you assume homoscedasticity for further parametric tests.

Practical Implications

Predictive Modeling: Features such as shots, xGoals, deep completions, and first-half goals appear highly indicative of final match outcomes. They may serve as strong predictors in classification models.

Further Analysis:

Pairwise differences from Tukey’s HSD highlight where exactly the mean of a feature is higher or lower. For example, if group 2 (home wins) consistently has higher home_shots than group 0 or group 1, that indicates the home team’s shot volume is a good indicator of a home win.

Variance differences from Levene’s test mean you should check assumptions (e.g., for ANOVA or regression). If variances are unequal, you might use Welch’s ANOVA or heteroscedasticity-robust methods.

Game Strategy Insights:

High shot counts and high attacking penetration (deep completions) are strongly tied to winning or losing. Goals scored before half-time are also strongly associated with eventual outcomes. Discipline stats (cards, fouls) do show some differences but are generally not as consistently discriminative across all pairs as shots and goals are.

Conclusion

The test results confirm that most match statistics differ significantly across the three outcome categories. While some features (e.g., home_ppda, home_fouls) only differ for specific pairwise comparisons, many others (shots, deep completions, half-time goals) show universal differences among all three groups. Furthermore, unequal variances are common, so any parametric modeling approach should account for heteroscedasticity or use robust methods.

Overall, these findings suggest that the identified features—particularly shots, half-time goals, and deep completions—are strongly associated with match outcomes and could be valuable for predictive or explanatory models of football results.
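The pairwise and variance checks discussed above can be sketched with SciPy alone. The notebook’s own Tukey code is not shown, so this is not a reproduction of it: it uses `scipy.stats.tukey_hsd` (available from SciPy 1.8) on three synthetic groups with hypothetical means and spreads, and derives a reject flag from the p-values at alpha = 0.05.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# Synthetic feature values per outcome group (0 = away win, 1 = draw, 2 = home win)
groups = {
    0: rng.normal(10.0, 4.0, size=800),
    1: rng.normal(12.0, 4.0, size=800),
    2: rng.normal(15.0, 6.0, size=800),   # larger spread on purpose
}

# Tukey's HSD: pairwise mean differences with family-wise error control
tukey = stats.tukey_hsd(groups[0], groups[1], groups[2])
ALPHA = 0.05
for i, j in [(0, 1), (0, 2), (1, 2)]:
    p = tukey.pvalue[i, j]
    print(f"group {i} vs group {j}: mean diff = {tukey.statistic[i, j]:.3f}, "
          f"p = {p:.4f}, reject = {p < ALPHA}")

# Levene's test (median-centred) for equality of variances across groups
levene_stat, levene_p = stats.levene(*groups.values(), center="median")
print(f"Levene: statistic = {levene_stat:.4f}, p-value = {levene_p:.4f}")
```

When Levene’s p-value is small, the Tukey results should be read with some caution, since the procedure assumes equal variances across groups.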

Chi-Square¶

gameID leagueID season date homeTeamID awayTeamID home_Goals away_Goals home_GoalsHalfTime away_GoalsHalfTime ... away_total_key_passes away_total_xGoalsChain away_total_xGoalsBuildup away_total_yellow_cards away_total_red_cards away_total_blocked_shots away_total_saved_shots gameresult home_redCards_binary away_redCards_binary
4140 4888 2 2014 2015-03-02 19:45:00 95 98 1 1 0 0 ... 9 1.909902 0.964562 5 0 4.0 0.0 1 Yes No

1 rows × 49 columns

gameID leagueID season date homeTeamID awayTeamID home_Goals away_Goals home_GoalsHalfTime away_GoalsHalfTime ... away_total_key_passes away_total_xGoalsChain away_total_xGoalsBuildup away_total_yellow_cards away_total_red_cards away_total_blocked_shots away_total_saved_shots gameresult home_redCards_binary away_redCards_binary

0 rows × 49 columns

2.0    3630
1.0    3433
3.0    2252
0.0    1813
4.0    1038
5.0     348
6.0     128
7.0      30
8.0       7
Name: home_yellowCards, dtype: int64
2.0    3605
1.0    3045
3.0    2691
0.0    1356
4.0    1319
5.0     480
6.0     141
7.0      35
8.0       6
9.0       2
Name: away_yellowCards, dtype: int64
2     3631
1     3433
3     2252
0     1813
4     1038
5+     513
Name: home_yellowCards_cat, dtype: int64
2     3605
1     3045
3     2691
0     1356
4     1319
5+     664
Name: away_yellowCards_cat, dtype: int64
=== Chi-square test: gameresult vs. leagueID ===
Contingency Table:
leagueID       1     2    3     4     5
gameresult                             
0            845   837  651   764   757
1            629   668  532   682   661
2           1186  1155  959  1214  1140

Expected Frequencies:
leagueID              1            2           3            4            5
gameresult                                                                
0            808.488959   808.488959  651.046372   808.488959   777.486751
1            665.419558   665.419558  535.837855   665.419558   639.903470
2           1186.091483  1186.091483  955.115773  1186.091483  1140.609779

Chi2 Statistic: 10.2695
p-value: 0.2466
Degrees of Freedom: 8
-------------------------------------------------- 

=== Chi-square test: gameresult vs. home_yellowCards_cat ===
Contingency Table:
home_yellowCards_cat     0     1     2    3    4   5+
gameresult                                           
0                      438   977  1161  731  361  186
1                      372   775   949  622  320  134
2                     1003  1681  1521  899  357  193

Expected Frequencies:
home_yellowCards_cat           0            1            2            3  \
gameresult                                                                
0                     551.049054  1043.437066  1103.617823   684.480126   
1                     453.535962   858.791483   908.322713   563.355205   
2                     808.414984  1530.771451  1619.059464  1004.164669   

home_yellowCards_cat           4          5+  
gameresult                                    
0                     315.493060  155.922871  
1                     259.663722  128.330915  
2                     462.843218  228.746215  

Chi2 Statistic: 199.2871
p-value: 0.0000
Degrees of Freedom: 10
-------------------------------------------------- 

=== Chi-square test: gameresult vs. away_yellowCards_cat ===
Contingency Table:
away_yellowCards_cat    0     1     2     3    4   5+
gameresult                                           
0                     472   969  1050   798  359  206
1                     290   707   910   688  392  185
2                     594  1369  1645  1205  568  273

Expected Frequencies:
away_yellowCards_cat           0            1            2            3  \
gameresult                                                                
0                     412.147003   925.507098  1095.715300   817.911199   
1                     339.213880   761.730284   901.818612   673.174448   
2                     604.639117  1357.762618  1607.466088  1199.914353   

away_yellowCards_cat           4          5+  
gameresult                                    
0                     400.901104  201.818297  
1                     329.958044  166.104732  
2                     588.140852  296.076972  

Chi2 Statistic: 46.5489
p-value: 0.0000
Degrees of Freedom: 10
-------------------------------------------------- 

=== Chi-square test: gameresult vs. home_redCards_binary ===
Contingency Table:
home_redCards_binary    No  Yes
gameresult                     
0                     3345  509
1                     2878  294
2                     5379  275

Expected Frequencies:
home_redCards_binary           No         Yes
gameresult                                   
0                     3526.349211  327.650789
1                     2902.329968  269.670032
2                     5173.320820  480.679180

Chi2 Statistic: 208.2850
p-value: 0.0000
Degrees of Freedom: 2
-------------------------------------------------- 

=== Chi-square test: gameresult vs. away_redCards_binary ===
Contingency Table:
away_redCards_binary    No  Yes
gameresult                     
0                     3605  249
1                     2828  344
2                     4851  803

Expected Frequencies:
away_redCards_binary           No         Yes
gameresult                                   
0                     3429.695268  424.304732
1                     2822.779811  349.220189
2                     5031.524921  622.475079

Chi2 Statistic: 140.3080
p-value: 0.0000
Degrees of Freedom: 2
-------------------------------------------------- 

1. gameresult vs. leagueID¶

p-value = 0.2466 (above 0.05). There is no statistically significant association between the match outcome (gameresult) and leagueID. In other words, the distribution of home wins/draws/away wins doesn’t differ across leagues by more than chance would explain at the 5% significance level.

2. gameresult vs. home_yellowCards_cat¶

p-value = 0.0000 (well below 0.05). A statistically significant association exists between game result and the binned home yellow-card categories (0, 1, 2, 3, 4, 5+). This implies that the number of yellow cards the home team receives is not independent of whether the home team ended up winning, drawing, or losing. In the contingency table, home wins are over-represented when the home team gets no yellow card (1003 observed vs. about 808 expected).

3. gameresult vs. away_yellowCards_cat¶

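p-value = 0.0000. Similarly, there is a significant association between game result and the binned away yellow-card categories. The number of yellow cards the away team receives is not independent of the match outcome. The chi-square statistic (46.5) is much smaller than for the home side (199.3), so the departure from independence is less pronounced.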

4. gameresult vs. home_redCards_binary¶

p-value = 0.0000. There’s a significant association between game result and whether the home team had no red cards (No) or at least one red card (Yes). The contingency table shows the direction: games with a home red card end in an away win more often than expected under independence (509 observed vs. about 328 expected) and in a home win less often (275 vs. about 481).

5. gameresult vs. away_redCards_binary¶

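p-value = 0.0000. Another significant association: the away team’s red-card status (No vs. Yes) is tied to the final outcome. Games with an away red card end in a home win more often than expected (803 observed vs. about 622) and in an away win less often (249 vs. about 424).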
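The card categories used in tests 2 to 5 could be built along these lines. The notebook’s own binning code is not shown, so this is a sketch on a tiny made-up frame; it assumes the card columns have no missing values (the integer cast would fail otherwise).

```python
import pandas as pd

# Tiny illustrative frame (values are made up)
demo = pd.DataFrame({
    "home_yellowCards": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 8.0],
    "home_redCards": [0, 0, 1, 0, 2, 0, 0, 1],
    "gameresult": [2, 2, 0, 1, 0, 1, 2, 0],
})

# Yellow cards: keep 0-4 as their own category, pool everything from 5 upward
as_text = demo["home_yellowCards"].astype(int).astype(str)
demo["home_yellowCards_cat"] = as_text.where(demo["home_yellowCards"] < 5, "5+")

# Red cards: binary flag for "at least one"
demo["home_redCards_binary"] = demo["home_redCards"].gt(0).map({True: "Yes", False: "No"})

# Contingency table in the same layout as the chi-square output above
table = pd.crosstab(demo["gameresult"], demo["home_redCards_binary"])
print(table)
```

Pooling the sparse high counts into "5+" keeps the expected frequencies in every cell large enough for the chi-square approximation to be reasonable.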

Overall Takeaways¶

LeagueID does not appear to affect the distribution of match outcomes (H/D/A).

Cards (both yellow and red, for home and away) do show a statistically significant association with match outcomes. The chi-square statistic on its own only says that the variables are not independent; it does not say which categories go with more wins or losses. For directionality or deeper insight, you could:

Look at residuals (which categories are over- or under-represented).
Conduct post-hoc tests (e.g., pairwise comparisons).
Examine the contingency tables in more detail (e.g., more red cards in losing teams?).

Because p-values for all card-related tests are effectively zero, you can conclude that match outcomes are not independent of the number of cards (yellow or red) teams receive. A small p-value with 12,680 games does not by itself mean the association is strong; an effect size such as Cramér’s V is needed to judge strength.
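The residual check suggested above can be sketched on the home red-card table, using the observed counts copied from the output. Pearson residuals show which cells are over- or under-represented. Cramér’s V is included as an effect size; it is an addition here, not something the notebook computes.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

# Observed counts copied from the gameresult vs. home_redCards_binary table above
observed = pd.DataFrame(
    [[3345, 509],
     [2878, 294],
     [5379, 275]],
    index=pd.Index([0, 1, 2], name="gameresult"),
    columns=pd.Index(["No", "Yes"], name="home_redCards_binary"),
)

chi2, p_value, dof, expected = chi2_contingency(observed)
expected = pd.DataFrame(expected, index=observed.index, columns=observed.columns)

# Pearson residuals: positive = more games than independence predicts
pearson_resid = (observed - expected) / np.sqrt(expected)

# Cramér's V: chi-square rescaled to the range 0..1
n_games = observed.to_numpy().sum()
cramers_v = np.sqrt(chi2 / (n_games * (min(observed.shape) - 1)))

print(f"Chi2 Statistic: {chi2:.4f}, dof: {dof}, Cramér's V: {cramers_v:.3f}")
print(pearson_resid.round(2))
```

The residuals are positive for away wins with a home red card and negative for home wins with one, while Cramér’s V comes out at about 0.13, which is usually read as a modest association despite the very small p-value.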

gameID leagueID season date homeTeamID awayTeamID home_Goals away_Goals home_GoalsHalfTime away_GoalsHalfTime ... away_total_xGoalsBuildup away_total_yellow_cards away_total_red_cards away_total_blocked_shots away_total_saved_shots gameresult home_redCards_binary away_redCards_binary home_yellowCards_cat away_yellowCards_cat
0 81 1 2015 2015-08-08 15:45:00 89 82 1 0 1 0 ... 0.811549 3 0 3.0 4.0 2 No No 2 3
1 82 1 2015 2015-08-08 18:00:00 73 71 0 1 0 0 ... 0.736815 4 0 2.0 2.0 0 No No 3 4
2 83 1 2015 2015-08-08 18:00:00 72 90 2 2 0 1 ... 1.030588 2 0 3.0 3.0 1 No No 1 2
3 84 1 2015 2015-08-08 18:00:00 75 77 4 2 3 0 ... 5.617276 4 0 2.0 3.0 2 No No 2 4
4 85 1 2015 2015-08-08 18:00:00 79 78 1 3 0 1 ... 8.554974 0 0 2.0 4.0 0 No No 1 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
12675 16131 5 2020 2021-05-23 19:00:00 168 166 1 2 1 1 ... 0.715843 2 0 1.0 3.0 0 No No 2 2
12676 16132 5 2020 2021-05-23 19:00:00 177 176 1 2 1 1 ... 1.758012 1 0 4.0 3.0 0 No No 1 1
12677 16133 5 2020 2021-05-23 19:00:00 163 235 2 0 1 0 ... 0.544502 0 0 0.0 2.0 2 No No 1 0
12678 16134 5 2020 2021-05-23 19:00:00 175 181 0 1 0 1 ... 0.764512 0 0 1.0 1.0 0 No No 1 0
12679 16135 5 2020 2021-05-23 19:00:00 225 179 1 1 1 0 ... 0.421488 1 0 2.0 0.0 1 No No 1 1

12680 rows × 51 columns

gameID leagueID season date homeTeamID awayTeamID home_Goals away_Goals home_GoalsHalfTime away_GoalsHalfTime ... away_total_xGoalsBuildup away_total_yellow_cards away_total_red_cards away_total_blocked_shots away_total_saved_shots gameresult home_redCards_binary away_redCards_binary home_yellowCards_cat away_yellowCards_cat
5270 6018 5 2014 2015-01-18 20:00:00 164 169 2 1 0 0 ... 0.348398 1 0 NaN NaN 2 Yes No 2 1
7592 9486 1 2018 2019-03-02 15:00:00 73 88 0 1 0 0 ... 2.990505 2 0 6.0 6.0 0 No No 1 2
11751 15207 3 2020 2020-11-21 14:30:00 262 119 1 2 0 1 ... 2.241605 1 0 2.0 2.0 0 No No 0 1

3 rows × 51 columns

gameID leagueID season date homeTeamID awayTeamID home_Goals away_Goals home_GoalsHalfTime away_GoalsHalfTime ... away_total_xGoalsBuildup away_total_yellow_cards away_total_red_cards away_total_blocked_shots away_total_saved_shots gameresult home_redCards_binary away_redCards_binary home_yellowCards_cat away_yellowCards_cat
1447 1528 4 2015 2015-11-29 23:30:00 138 146 1 0 0 0 ... 0.000000 2 2 NaN NaN 2 No Yes 3 2
2267 2348 5 2016 2016-10-23 22:45:00 161 164 0 0 0 0 ... 0.000000 2 0 NaN NaN 1 No No 0 2
4644 5392 3 2014 2014-10-18 14:30:00 117 123 6 0 4 0 ... 0.000000 1 0 NaN NaN 2 No No 0 1
5270 6018 5 2014 2015-01-18 20:00:00 164 169 2 1 0 0 ... 0.348398 1 0 NaN NaN 2 Yes No 2 1
5771 7413 1 2017 2018-03-10 15:00:00 219 84 0 0 0 0 ... 0.000000 3 1 NaN NaN 1 No Yes 2 3
6179 7821 2 2017 2018-04-17 18:45:00 106 116 4 0 1 0 ... 0.000000 1 0 NaN NaN 2 No No 1 1
11644 15100 4 2020 2021-04-18 14:15:00 143 156 5 0 2 0 ... 0.000000 1 0 NaN NaN 2 No No 0 1
12642 16098 5 2020 2021-05-01 19:00:00 160 170 2 0 1 0 ... 0.000000 1 1 NaN NaN 2 No Yes 2 3

8 rows × 51 columns

(12680, 40)